AITopics | interestingness measure

Collaborating Authors

interestingness measure

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning Interestingness in Automated Mathematical Theory Formation

Neural Information Processing SystemsJun-16-2026, 07:46:51 GMT

We take two key steps in automating the open-ended discovery of new mathematical theories, a grand challenge in artificial intelligence. First, we introduce FERMAT, a reinforcement learning (RL) environment that models concept discovery and theorem-proving using a set of symbolic actions, opening up a range of RL problems relevant to theory discovery. Second, we explore a specific problem through FERMAT: automatically scoring the interestingness of mathematical objects. We investigate evolutionary algorithms for synthesizing nontrivial interestingness measures. In particular, we introduce an LLM-based evolutionary algorithm that features function abstraction, leading to notable improvements in discovering elementary number theory and finite fields over hard-coded baselines.

evolutionary algorithm, large language model, machine learning, (21 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.71)
(4 more...)

Add feedback

Efficiently Sampling Interval Patterns from Numerical Databases

Bekkoucha, Djawad, Diop, Lamine, Ouali, Abdelkader, Crémilleux, Bruno, Boizumault, Patrice

arXiv.org Artificial IntelligenceDec-2-2025

Pattern sampling has emerged as a promising approach for information discovery in large databases, allowing analysts to focus on a manageable subset of patterns. In this approach, patterns are randomly drawn based on an interestingness measure, such as frequency or hyper-volume. This paper presents the first sampling approach designed to handle interval patterns in numerical databases. This approach, named Fips, samples interval patterns proportionally to their frequency. It uses a multi-step sampling procedure and addresses a key challenge in numerical data: accurately determining the number of interval patterns that cover each object. We extend this work with HFips, which samples interval patterns proportionally to both their frequency and hyper-volume. These methods efficiently tackle the well-known long-tail phenomenon in pattern sampling. We formally prove that Fips and HFips sample interval patterns in proportion to their frequency and the product of hyper-volume and frequency, respectively. Through experiments on several databases, we demonstrate the quality of the obtained patterns and their robustness against the long-tail phenomenon.

artificial intelligence, interval pattern, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2512.00105

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report (0.70)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

SubROC: AUC-Based Discovery of Exceptional Subgroup Performance for Binary Classifiers

Siegl, Tom, Coşkun, Kutalmış, Hiller, Bjarne C., Mirzaei, Amin, Lemmerich, Florian, Becker, Martin

arXiv.org Artificial IntelligenceAug-28-2025

Machine learning (ML) is increasingly employed in real-world applications like medicine or economics, thus, potentially affecting large populations. However, ML models often do not perform homogeneously, leading to underperformance or, conversely, unusually high performance in certain subgroups (e.g., sex=female AND marital_status=married). Identifying such subgroups can support practical decisions on which subpopulation a model is safe to deploy or where more training data is required. However, an efficient and coherent framework for effective search is missing. Consequently, we introduce SubROC, an open-source, easy-to-use framework based on Exceptional Model Mining for reliably and efficiently finding strengths and weaknesses of classification models in the form of interpretable population subgroups. SubROC incorporates common evaluation measures (ROC and PR AUC), efficient search space pruning for fast exhaustive subgroup search, control for class imbalance, adjustment for redundant patterns, and significance testing. We illustrate the practical benefits of SubROC in case studies as well as in comparative analyses across multiple datasets.

artificial intelligence, machine learning, subgroup, (18 more...)

arXiv.org Artificial Intelligence

2505.11283

Country: North America > United States (0.93)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.66)

Add feedback

ILAEDA: An Imitation Learning Based Approach for Automatic Exploratory Data Analysis

Manatkar, Abhijit, Patel, Devarsh, Patel, Hima, Manwani, Naresh

arXiv.org Artificial IntelligenceOct-15-2024

Automating end-to-end Exploratory Data Analysis (AutoEDA) is a challenging open problem, often tackled through Reinforcement Learning (RL) by learning to predict a sequence of analysis operations (FILTER, GROUP, etc). Defining rewards for each operation is a challenging task and existing methods rely on various \emph{interestingness measures} to craft reward functions to capture the importance of each operation. In this work, we argue that not all of the essential features of what makes an operation important can be accurately captured mathematically using rewards. We propose an AutoEDA model trained through imitation learning from expert EDA sessions, bypassing the need for manually defined interestingness measures. Our method, based on generative adversarial imitation learning (GAIL), generalizes well across datasets, even with limited expert data. We also introduce a novel approach for generating synthetic EDA demonstrations for training. Our method outperforms the existing state-of-the-art end-to-end EDA approach on benchmarks by upto 3x, showing strong performance and generalization, while naturally capturing diverse interestingness measures in generated EDA sessions.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2410.11276

Country:

North America > United States > Louisiana > East Baton Rouge Parish > Baton Rouge (0.05)
Asia > India > Telangana > Hyderabad (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Automated Question Generation on Tabular Data for Conversational Data Exploration

Chaudhuri, Ritwik, C, Rajmohan, DB, Kirushikesh, Agarwal, Arvind

arXiv.org Artificial IntelligenceJul-10-2024

Exploratory data analysis (EDA) is an essential step for analyzing a dataset to derive insights. Several EDA techniques have been explored in the literature. Many of them leverage visualizations through various plots. But it is not easy to interpret them for a non-technical user, and producing appropriate visualizations is also tough when there are a large number of columns. Few other works provide a view of some interesting slices of data but it is still difficult for the user to draw relevant insights from them. Of late, conversational data exploration is gaining a lot of traction among non-technical users. It helps the user to explore the dataset without having deep technical knowledge about the data. Towards this, we propose a system that recommends interesting questions in natural language based on relevant slices of a dataset in a conversational setting. Specifically, given a dataset, we pick a select set of interesting columns and identify interesting slices of such columns and column combinations based on few interestingness measures. We use our own fine-tuned variation of a pre-trained language model(T5) to generate natural language questions in a specific manner. We then slot-fill values in the generated questions and rank them for recommendations. We show the utility of our proposed system in a coversational setting with a collection of real datasets.

dataset, operator, salary, (13 more...)

arXiv.org Artificial Intelligence

2407.12859

Country:

North America > United States > New York (0.05)
Asia > India (0.05)
North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.67)

Add feedback

Enhancing Actionable Formal Concept Identification with Base-Equivalent Conceptual-Relevance

Bobi, Ayao, Missaoui, Rokia, Ibrahim, Mohamed Hamza

arXiv.org Artificial IntelligenceDec-21-2023

In knowledge discovery applications, the pattern set generated from data can be tremendously large and hard to explore by analysts. In the Formal Concept Analysis (FCA) framework, there have been studies to identify important formal concepts through the stability index and other quality measures. In this paper, we introduce the Base-Equivalent Conceptual Relevance (BECR) score, a novel conceptual relevance interestingness measure for improving the identification of actionable concepts. From a conceptual perspective, the base and equivalent attributes are considered meaningful information and are highly essential to maintain the conceptual structure of concepts. Thus, the basic idea of BECR is that the more base and equivalent attributes and minimal generators a concept intent has, the more relevant it is. As such, BECR quantifies these attributes and minimal generators per concept intent. Our preliminary experiments on synthetic and real-world datasets show the efficiency of BECR compared to the well-known stability index.

formal concept, minimal generator, stability, (13 more...)

arXiv.org Artificial Intelligence

2312.14421

Country:

North America > Canada > Quebec (0.06)
North America > United States > New York (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

A comprehensive review of visualization methods for association rule mining: Taxonomy, Challenges, Open problems and Future ideas

Fister, Iztok Jr., Fister, Iztok, Fister, Dušan, Podgorelec, Vili, Salcedo-Sanz, Sancho

arXiv.org Artificial IntelligenceFeb-24-2023

Association rule mining is intended for searching for the relationships between attributes in transaction databases. The whole process of rule discovery is very complex, and involves pre-processing techniques, a rule mining step, and post-processing, in which visualization is carried out. Visualization of discovered association rules is an essential step within the whole association rule mining pipeline, to enhance the understanding of users on the results of rule mining. Several association rule mining and visualization methods have been developed during the past decades. This review paper aims to create a literature review, identify the main techniques published in peer-reviewed literature, examine each method's main features, and present the main applications in the field. Defining the future steps of this research area is another goal of this review paper.

artificial intelligence, association rule, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2302.12594

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Slovenia > Drava > Municipality of Maribor > Maribor (0.04)
Asia > Singapore (0.04)
(2 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Explainable Subgraphs with Surprising Densities: A Subgroup Discovery Approach

Deng, Junning, Kang, Bo, Lijffijt, Jefrey, De Bie, Tijl

arXiv.org Machine LearningJan-10-2020

The connectivity structure of graphs is typically related to the attributes of the nodes. In social networks for example, the probability of a friendship between two people depends on their attributes, such as their age, address, and hobbies. The connectivity of a graph can thus possibly be understood in terms of patterns of the form 'the subgroup of individuals with properties X are often (or rarely) friends with individuals in another subgroup with properties Y'. Such rules present potentially actionable and generalizable insights into the graph. We present a method that finds pairs of node subgroups between which the edge density is interestingly high or low, using an information-theoretic definition of interestingness. This interestingness is quantified subjectively, to contrast with prior information an analyst may have about the graph. This view immediately enables iterative mining of such patterns. Our work generalizes prior work on dense subgraph mining (i.e. subgraphs induced by a single subgroup). Moreover, not only is the proposed method more general, we also demonstrate considerable practical advantages for the single subgroup special case.

background distribution, graph, subgroup, (17 more...)

arXiv.org Machine Learning

2002.00793

Country:

North America > United States > California (0.14)
Asia > China (0.05)
Europe > Netherlands (0.04)
(24 more...)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Services (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)

Add feedback

Information-theoretic Interestingness Measures for Cross-Ontology Data Mining

Manda, Prashanti, McCarthy, Fiona, Nanduri, Bindu, Wang, Hui, Bridges, Susan M.

arXiv.org Artificial IntelligenceMay-16-2016

Community annotation of biological entities with concepts from multiple bio-ontologies has created large and growing repositories of ontology-based annotation data with embedded implicit relationships among orthogonal ontologies. Development of efficient data mining methods and metrics to mine and assess the quality of the mined relationships has not kept pace with the growth of annotation data. In this study, we present a data mining method that uses ontology-guided generalization to discover relationships across ontologies along with a new interestingness metric based on information theory. We apply our data mining algorithm and interestingness measures to datasets from the Gene Expression Database at the Mouse Genome Informatics as a preliminary proof of concept to mine relationships between developmental stages in the mouse anatomy ontology and Gene Ontology concepts (biological process, molecular function and cellular component). In addition, we present a comparison of our interestingness metric to four existing metrics. Ontology-based annotation datasets provide a valuable resource for discovery of relationships across ontologies. The use of efficient data mining methods and appropriate interestingness metrics enables the identification of high quality relationships.

artificial intelligence, ontology, transaction, (15 more...)

arXiv.org Artificial Intelligence

1504.08027

Country:

North America > United States > Arizona (0.28)
North America > United States > Alabama (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)

Add feedback

Standardizing Interestingness Measures for Association Rules

Shaikh, Mateen, McNicholas, Paul D., Antonie, M. Luiza, Murphy, T. Brendan

arXiv.org Machine LearningAug-16-2013

Interestingness measures provide information that can be used to prune or select association rules. A given value of an interestingness measure is often interpreted relative to the overall range of the values that the interestingness measure can take. However, properties of individual association rules restrict the values an interestingness measure can achieve. An interesting measure can be standardized to take this into account, but this has only been done for one interestingness measure to date, i.e., the lift. Standardization provides greater insight than the raw value and may even alter researchers' perception of the data. We derive standardized analogues of three interestingness measures and use real and simulated data to compare them to their raw versions, each other, and the standardized lift.

artificial intelligence, expert system, interestingness measure, (17 more...)

arXiv.org Machine Learning

1308.374

Country:

North America > Canada (0.68)
North America > United States (0.67)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.91)

Add feedback